Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers
Authors
Abstract
This paper presents a simple method for the lexical analysis of agglutinative languages such as Korean, which have a rich morphology. In particular, for nouns and adverbs with regular morphological modifications and/or high productivity, there is no need to artificially construct huge dictionaries of all inflected forms of lemmas. To construct a dictionary of lemmas and lexical transducers, we first automatically build a dictionary of all inflected forms from the KAIST POS-Tagged Corpus. Second, we separate each inflected form into its lemma part and its sequence of inflectional suffixes. Third, we describe lexical transducers (i.e., morphological rules) that recognize all inflected forms of noun and adverb lemmas according to the combinatorial restrictions between lemmas and their inflectional suffixes. Finally, we evaluate the advantages of this method.
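The steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the romanised toy data, the names `ANALYSED`, `build_lexicon` and `analyse`, and the use of a plain lookup over suffix-sequence classes in place of compiled lexical transducers are all assumptions made for illustration. The sketch groups lemmas that accept the same set of suffix sequences into one class, then recognises a surface form by splitting it into a known lemma plus a suffix sequence allowed for that lemma's class.

```python
from collections import defaultdict

# Hypothetical, romanised toy data: (lemma, suffix sequence) pairs as they
# might be extracted from a POS-tagged corpus. Not taken from the KAIST corpus.
ANALYSED = [
    ("hakgyo", ()), ("hakgyo", ("ga",)), ("hakgyo", ("eseo",)),
    ("hakgyo", ("eseo", "neun")),
    ("namu", ()), ("namu", ("ga",)), ("namu", ("eseo",)),
    ("namu", ("eseo", "neun")),
    ("jip", ()), ("jip", ("i",)), ("jip", ("eseo",)),
]


def build_lexicon(analysed):
    """Return (lemma -> class id, class id -> allowed suffix sequences)."""
    suffixes_by_lemma = defaultdict(set)
    for lemma, suffixes in analysed:
        suffixes_by_lemma[lemma].add(tuple(suffixes))

    class_ids = {}    # frozenset of suffix sequences -> class id
    lemma_class = {}  # lemma -> class id
    for lemma, seqs in suffixes_by_lemma.items():
        key = frozenset(seqs)
        # Lemmas with identical suffix behaviour share one class.
        lemma_class[lemma] = class_ids.setdefault(key, len(class_ids))
    class_suffixes = {cid: key for key, cid in class_ids.items()}
    return lemma_class, class_suffixes


def analyse(surface, lemma_class, class_suffixes):
    """Return every (lemma, suffix sequence) reading of a surface form."""
    readings = []
    for i in range(1, len(surface) + 1):
        lemma, rest = surface[:i], surface[i:]
        cid = lemma_class.get(lemma)
        if cid is None:
            continue  # prefix is not a known lemma
        for seq in class_suffixes[cid]:
            # Combinatorial restriction: only sequences of this class count.
            if "".join(seq) == rest:
                readings.append((lemma, seq))
    return readings


LEMMA_CLASS, CLASS_SUFFIXES = build_lexicon(ANALYSED)

if __name__ == "__main__":
    print(analyse("namueseoneun", LEMMA_CLASS, CLASS_SUFFIXES))
    print(analyse("jipga", LEMMA_CLASS, CLASS_SUFFIXES))
```

In this toy lexicon the two vowel-final lemmas fall into one class and the consonant-final lemma into another, so the form `jipga` is rejected even though both `jip` and `ga` occur in the data. The stored lexicon holds three lemmas and two suffix classes instead of eleven inflected forms, which is the kind of saving the abstract points to.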
Similar resources
Lexical Cohesion in English and Persian Abstracts
This study compares and contrasts lexical cohesion in English and Persian abstracts of Iranian medical students’ theses to appreciate textualization processes in the two languages. For this purpose, one hundred English and Persian abstracts were selected randomly and analyzed based on Seddigh and Yarmohamadi’s (1996) lexical cohesion framework, a version of Halliday and Hasan’s (1976) and Halli...
Automatic morphological analysis of Basque
1 Introduction The two-level model of computational morphology was proposed by Koskenniemi (1983) and has found widespread acceptance due mostly to its general applicability, declarativeness of rules and clear separation of linguistic knowledge and program. The essential difference from generative phonology is that there are no intermediate states between lexical and surface represen...
A functional operator-based morphological analysis of Japanese
A universal set of functional operators as proposed in Role and Reference Grammars can be used to provide a robust morphology analyser development scheme, which gives the developer of the analyser a clear guiding principle guaranteeing the exhaustiveness of his grammar from the inception of the development task, freeing him from the complex bookkeeping of continuation lexicons often associated ...
A Probabilistic Translation Method for Dictionary-based Cross-lingual Information Retrieval in Agglutinative Languages
Translation ambiguity, out-of-vocabulary words, and missing translations in bilingual dictionaries make dictionary-based Cross-language Information Retrieval (CLIR) a challenging task. Moreover, in agglutinative languages that lack reliable stemmers, the absence of various lexical formations from bilingual dictionaries degrades CLIR performance. This paper aims to introduce a probabilistic tr...
Comparative Study of Degree of Bilingualism in Lexical Retrieval and Language Learning Strategies
This study compares lexical retrieval amongst monolinguals and intermediate bilinguals and advanced bilinguals. It also investigates the possible effects of their language learning strategies on their respective lexical retrieval advantage. The study used a mixed methods design and the groups consisted of 20 Persian near-monolinguals, 20 Persian-English intermediate level bilinguals, and 20 Per...
Journal title:
Volume, issue
Pages -
Publication date 2004